Using Heuristics and Genetic Algorithms for Large-scale Database Query Optimization
Author
Abstract
Distributed database system technology is one of the major developments in the information technology area. It will continue to have a significant impact on data processing in the coming years because distributed database systems have many potential advantages over centralized systems for geographically distributed organizations. The continuing interest in distributed database systems in the research community and the marketplace, together with the introduction of many commercial products, indicates that distributed database systems will play a more important role in data processing and will eventually replace centralized systems as the major database technology. The availability of high-speed communication networks and, especially, the phenomenal popularity of the Internet and intranets will undoubtedly speed up this transition. Some challenging problems must be solved before the full potential benefits of distributed database technology can be realized. Among them is query processing (including query optimization), one of the most important issues in distributed database system design. The query optimization problem in large-scale distributed databases is NP-hard and difficult to solve. In this study, the query optimization problem is reduced to a join ordering problem similar to a variant of the traveling salesman problem. We explored several heuristics and a genetic algorithm for solving the join ordering problem. Computational experiments on these algorithms were conducted and their solution qualities compared. The experiments show that heuristics and genetic algorithms are viable methods for solving the query optimization problem in large-scale distributed database systems.
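The abstract does not describe the specific heuristics, the genetic operators, or the cost model used in the study. As a rough illustration only, the sketch below treats join ordering as a search over permutations of relations, much like a tour in the traveling salesman problem, and compares a greedy heuristic with a small genetic algorithm. The relation names, cardinalities, selectivities, the cost function (sum of intermediate result sizes for a left-deep plan), and all parameter values are hypothetical assumptions, not taken from the paper.

```python
import random

# Hypothetical toy inputs: relation cardinalities and pairwise join selectivities.
CARD = {"A": 1000, "B": 200, "C": 5000, "D": 50, "E": 800}
SEL = {frozenset(("A", "B")): 0.01, frozenset(("B", "C")): 0.001,
       frozenset(("C", "D")): 0.02, frozenset(("D", "E")): 0.05,
       frozenset(("A", "E")): 0.005}

def selectivity(r, s):
    # Pairs with no join predicate behave like a cross product (selectivity 1).
    return SEL.get(frozenset((r, s)), 1.0)

def cost(order):
    """Assumed cost model: sum of intermediate result sizes of a left-deep plan."""
    size, total, joined = CARD[order[0]], 0.0, [order[0]]
    for rel in order[1:]:
        factor = 1.0
        for prev in joined:
            factor *= selectivity(prev, rel)
        size = size * CARD[rel] * factor
        total += size
        joined.append(rel)
    return total

def greedy(relations):
    """Nearest-neighbour style heuristic: always append the cheapest next relation."""
    best = None
    for start in relations:
        order, rest = [start], set(relations) - {start}
        while rest:
            nxt = min(sorted(rest), key=lambda r: cost(order + [r]))
            order.append(nxt)
            rest.remove(nxt)
        if best is None or cost(order) < cost(best):
            best = order
    return best

def order_crossover(p1, p2, rng):
    """Order crossover: keep a slice of p1, fill the remaining slots in p2's order."""
    i, j = sorted(rng.sample(range(len(p1)), 2))
    middle = p1[i:j + 1]
    rest = [r for r in p2 if r not in middle]
    return rest[:i] + middle + rest[i:]

def genetic(relations, pop_size=30, generations=50, mutation_rate=0.2, seed=0):
    """Small elitist GA over join orders; all parameter values are illustrative."""
    rng = random.Random(seed)
    pop = [rng.sample(relations, len(relations)) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=cost)
        elite = pop[: pop_size // 2]          # keep the cheaper half unchanged
        children = []
        while len(elite) + len(children) < pop_size:
            p1, p2 = rng.sample(elite, 2)
            child = order_crossover(p1, p2, rng)
            if rng.random() < mutation_rate:  # swap mutation keeps a valid permutation
                a, b = rng.sample(range(len(child)), 2)
                child[a], child[b] = child[b], child[a]
            children.append(child)
        pop = elite + children
    return min(pop, key=cost)

if __name__ == "__main__":
    rels = list(CARD)
    for name, plan in (("greedy", greedy(rels)), ("genetic", genetic(rels))):
        print(name, plan, cost(plan))
```

Representing a plan as a permutation is what makes the traveling salesman analogy usable: crossover and mutation operators that preserve permutations can be applied directly, and every candidate remains a valid join order.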
Similar resources
Relational Databases Query Optimization using Hybrid Evolutionary Algorithm
Optimizing database queries is a hard research problem. Exhaustive search techniques such as dynamic programming are suitable for queries with a few relations, but as the number of relations in a query grows, they demand so much memory and processing that they become impractical, so random and evolutionary methods are needed instead. The use of evolutionary methods, beca...
Scheduling Problem of Virtual Cellular Manufacturing Systems (VCMS); Using Simulated Annealing and Genetic Algorithm based Heuristics
In this paper, we present simulated annealing (SA) and genetic algorithm (GA) based heuristics for the job scheduling problem in virtual cellular manufacturing systems. A virtual manufacturing cell (VMC) is a group of resources that is dedicated to the manufacturing of a part family. Although this grouping is not reflected in the physical structure of the manufacturing system, machin...
Addressing a fixed charge transportation problem with multi-route and different capacities by novel hybrid meta-heuristics
In most real-world applications and problems, a homogeneous product is carried from an origin to a destination using different transportation modes (e.g., road, air, rail, and water). This paper investigates a fixed charge transportation problem (FCTP) in which there are different routes with different capacities between suppliers and customers. To solve such an NP-hard problem, four meta-heur...
Solving Re-entrant No-wait Flexible Flowshop Scheduling Problem; Using the Bottleneck-based Heuristic and Genetic Algorithm
In this paper, we study the re-entrant no-wait flexible flowshop scheduling problem with makespan minimization objective and then consider two parallel machines for each stage. The main characteristic of a re-entrant environment is that at least one job is likely to visit certain stages more than once during the process. The no-wait property describes a situation in which every job has its own ...